Frequency offset correction in single sideband (SSB) speech by deep neural network for speaker verification

نویسندگان

  • Hua Xing
  • Gang Liu
  • John H. L. Hansen
چکیده

Communication system mismatch represents a major influence for loss in speaker recognition performance. This paper considers a type of nonlinear communication system mismatchmodulation/demodulation (Mod/DeMod) carrier drift in single sideband (SSB) speech signals. We focus on the problem of estimating frequency offset in SSB speech in order to improve speaker verification performance of the drifted speech. Based on a two-step framework from previous work, we propose using a multi-layered neural network architecture, stacked denoising autoencoder (SDA), to determine the unique interval of the offset value in the first step. Experimental results demonstrate that the SDA based system can produce up to a +16.1% relative improvement in frequency offset estimation accuracy. A speaker verification evaluation shows a +65.9% relative improvement in EER when SSB speech signal is compensated with the frequency offset value estimated by the proposed method.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

شبکه عصبی پیچشی با پنجره‌های قابل تطبیق برای بازشناسی گفتار

Although, speech recognition systems are widely used and their accuracies are continuously increased, there is a considerable performance gap between their accuracies and human recognition ability. This is partially due to high speaker variations in speech signal. Deep neural networks are among the best tools for acoustic modeling. Recently, using hybrid deep neural network and hidden Markov mo...

متن کامل

Recognition of Speech Enhanced by Blind Compensation for Artifacts of Single-Sideband Demodulation

This paper concerns the automatic recognition of speech that has been distorted by frequency shifting introduced by a transmitter-receiver frequency mismatch in communications systems using single-sideband (SSB) modulation. The degradation in recognition accuracy depends both on the frequency shift induced by mistuned SSB and additive noise, with a reduction in SNR causing the degradation produ...

متن کامل

Time-Contrastive Learning Based Unsupervised DNN Feature Extraction for Speaker Verification

In this paper, we present a time-contrastive learning (TCL) based unsupervised bottleneck (BN) feature extraction method for speech signals with an application to speaker verification. The method exploits the temporal structure of a speech signal and more specifically, it trains deep neural networks (DNNs) to discriminate temporal events obtained by uniformly segmenting the signal without using...

متن کامل

Improving Speaker Verification for Reverberant Conditions with Deep Neural Network Dereverberation Processing

We present an improved method for training Deep Neural Networks for dereverberation and show that it can improve performance for the speech processing tasks of speaker verification and speech enhancement. We replicate recently proposed methods for dereverberation using Deep Neural Networks and present our improved method, highlighting important aspects that influence performance. We then experi...

متن کامل

Deep Neural Network Embeddings for Text-Independent Speaker Verification

This paper investigates replacing i-vectors for text-independent speaker verification with embeddings extracted from a feedforward deep neural network. Long-term speaker characteristics are captured in the network by a temporal pooling layer that aggregates over the input speech. This enables the network to be trained to discriminate between speakers from variablelength speech segments. After t...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2015